[Opt](ai-func) Improving AI function performance #62494

Open
linrrzqqq wants to merge 4 commits into apache:master from linrrzqqq:opt-ai-pr

Conversation

@linrrzqqq
Contributor

@linrrzqqq commented Apr 14, 2026

Release note

This PR improves the performance of AI functions through batch sending. For embed, the embed_max_batch_size session variable controls the number of (text/file) items sent in a single batch; the remaining functions internally maintain a conservative context window.

The current sending format is similar to:

"input": [
    {"role": "system", "content": "system_prompt here"},
    {"role": "user",
     "content": [
        {"idx": 1, "text": "xxx"},
        {"idx": 2, "text": "xxx"}
     ]
    }
]
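The payload shape above can be sketched as follows. This is a hypothetical helper, not the PR's actual code; in real code each text must also be JSON-escaped before concatenation:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sketch: serialize a batch of row texts into the "input"
// payload shown above, tagging each item with its 1-based row index so
// per-row results can be matched back after the response arrives.
// Real code must JSON-escape system_prompt and each text.
std::string build_batched_input(const std::string& system_prompt,
                                const std::vector<std::string>& texts) {
    std::string out = "\"input\": [\n";
    out += "    {\"role\": \"system\", \"content\": \"" + system_prompt + "\"},\n";
    out += "    {\"role\": \"user\",\n     \"content\": [\n";
    for (size_t i = 0; i < texts.size(); ++i) {
        out += "        {\"idx\": " + std::to_string(i + 1) +
               ", \"text\": \"" + texts[i] + "\"}";
        if (i + 1 < texts.size()) out += ",";
        out += "\n";
    }
    out += "     ]\n    }\n]";
    return out;
}
```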

Performance:

-- AI_CLASSIFY
SELECT
    COUNT(*) AS total_rows,
    SUM(IF(res = 'science', 1, 0)) AS expect_eq_res
FROM (
    SELECT AI_CLASSIFY('deepseek-chat', str, ['science', 'sport']) AS res
    FROM test_str
) t;

-- before
+------------+---------------+
| total_rows | expect_eq_res |
+------------+---------------+
|        100 |           100 |
+------------+---------------+
1 row in set (2 min 11.579 sec)

-- now
+------------+---------------+
| total_rows | expect_eq_res |
+------------+---------------+
|        100 |           100 |
+------------+---------------+
1 row in set (10.487 sec)

-- AI_FILTER
SELECT 
    COUNT(*) AS total_rows,
    SUM(IF(res = 1, 1, 0)) AS zero_res_rows
FROM (
    SELECT AI_FILTER('deepseek-chat', str) AS res 
    FROM test_str
) t;

-- before
+------------+---------------+
| total_rows | zero_res_rows |
+------------+---------------+
|        100 |             0 |
+------------+---------------+
1 row in set (2 min 2.979 sec)

-- now
+------------+---------------+
| total_rows | zero_res_rows |
+------------+---------------+
|        100 |             0 |
+------------+---------------+
1 row in set (5.007 sec)

-- EMBED
select count(embed('qwen-embed', str)) FROM test_str;

-- before
+---------------------------------+
| count(embed('qwen-embed', str)) |
+---------------------------------+
|                             100 |
+---------------------------------+
1 row in set (4 min 4.888 sec)

-- now
set embed_max_batch_size = 10;
+---------------------------------+
| count(embed('qwen-embed', str)) |
+---------------------------------+
|                             100 |
+---------------------------------+
1 row in set (23.424 sec)

-- Multimodal_Embed
SELECT COUNT(EMBED('qwen_mul_embed', to_json(file))) FROM test_jpg2;
-- before: couldn't get results for a long time (over 20 mins).
-- now
set embed_max_batch_size = 20;
+----------------------------------------------------+
|                                               .... |
|                                               1152 |
+----------------------------------------------------+
1142 rows in set (1 min 13.577 sec)

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (2/2) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (2/2) 🎉
Increment coverage report
Complete coverage report

@linrrzqqq
Contributor Author

/review

Contributor

@github-actions bot left a comment

Found 1 blocking issue.

  1. be/src/exprs/function/ai/embed.h: text embedding batching breaks Gemini resources. _execute_text_embed() now batches multiple prompts into one build_embedding_request(inputs, ...) call, but GeminiAdapter::build_embedding_request() still serializes them into a single content object and parse_embedding_response() still returns a single embedding for the text path. Running embed() on multiple rows with a GEMINI AI resource will now fail the cardinality check (expected N got 1) or effectively only embed the last input. Please either keep Gemini on the old per-row path or implement Gemini's true batch text embedding protocol before enabling batching here.

Critical checkpoint conclusions:

  • Goal and correctness: The PR aims to improve AI-function performance via batching. That is only partially achieved because multi-row text embed() is no longer correct for all supported providers. Existing tests do not cover the failing Gemini text-embedding path.
  • Scope/minimality: The change is focused, but the generic text-embedding batching applies to providers with different protocol semantics.
  • Concurrency: No new thread-safety or locking issue identified; the path remains synchronous.
  • Lifecycle/static init: No special lifecycle or static initialization issue found.
  • Configuration: multimodal_embed_max_batch_file_count is added and forwarded to BE correctly through TQueryOptions.
  • Compatibility: No storage-format or persistence compatibility issue found; FE/BE query-option propagation looks complete for the new variable.
  • Parallel paths: Multimodal embedding and string AI functions were updated, but the provider-specific Gemini text embedding path was not handled consistently.
  • Special conditions/checks: The new multimodal input validation is reasonable.
  • Test coverage: Unit coverage improved, but there is no test for multi-row embed() with a GEMINI resource, which is the broken path here.
  • Test result files: Not applicable.
  • Observability: Existing observability is sufficient for this review; no blocker here.
  • Transaction/persistence/data writes/FE-BE variable passing: Not applicable beyond query-option forwarding, which is covered.
  • Performance: Batching should help supported providers, but this regression must be fixed first.
  • Other issues: No additional blocking issue confirmed beyond the Gemini regression.
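The "expected N got 1" cardinality failure described in the blocking issue can be illustrated with a minimal sketch. The function and types here are hypothetical, not the Doris code; it only shows the class of check a batched caller performs on parsed embeddings:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative only: a batched caller sends N texts and expects N embeddings
// back. An adapter that collapses the batch into a single content object
// yields one embedding, which fails this check with "expected N, got 1".
bool check_embedding_cardinality(size_t expected_rows,
                                 const std::vector<std::vector<float>>& embeddings,
                                 std::string* err) {
    if (embeddings.size() != expected_rows) {
        *err = "expected " + std::to_string(expected_rows) + " embeddings, got " +
               std::to_string(embeddings.size());
        return false;
    }
    return true;
}
```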

Comment thread be/src/exprs/function/ai/embed.h
@linrrzqqq force-pushed the opt-ai-pr branch 3 times, most recently from 831a82e to 75d0015 on April 15, 2026 04:22
@linrrzqqq
Contributor Author

run buildall

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 6.25% (2/32) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 75.90% (337/444) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.08% (20174/38008)
Line Coverage 36.65% (189966/518326)
Region Coverage 32.92% (147589/448341)
Branch Coverage 34.06% (64634/189786)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 74.02% (302/408) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.66% (27419/37222)
Line Coverage 57.31% (296134/516722)
Region Coverage 54.64% (247204/452454)
Branch Coverage 56.21% (106993/190355)

@linrrzqqq
Contributor Author

/review

Contributor

@github-actions bot left a comment

I found one blocking issue.

  1. Critical checkpoint: Goal / correctness / tests
    Conclusion: The PR goal is to batch AI function requests for better performance, and the BE unit tests cover several batch-splitting and parsing cases. However, one provider-specific compatibility path is broken, so the implementation does not fully accomplish the goal safely.

  2. Critical checkpoint: Scope and focus
    Conclusion: The change is reasonably focused on AI batching plus the needed session/thrift plumbing.

  3. Critical checkpoint: Concurrency
    Conclusion: No new concurrency or locking risks were introduced in the modified paths. The execution remains row-batch local inside function evaluation.

  4. Critical checkpoint: Lifecycle / static initialization
    Conclusion: No applicable lifecycle or static initialization issues found in the changed code.

  5. Critical checkpoint: Configuration changes
    Conclusion: embed_max_batch_size is added as a forwarded session variable and wired into TQueryOptions. FE forwarding looks complete for the new option.

  6. Critical checkpoint: Compatibility / protocol changes
    Conclusion: FE/BE protocol compatibility for embed_max_batch_size is handled, but provider API compatibility is not fully preserved: Gemini embedding now emits a batched requests payload without any corresponding endpoint normalization or compatibility handling.

  7. Critical checkpoint: Parallel code paths
    Conclusion: Text AI functions and EMBED were both updated for batching. No missing sibling path stood out beyond the Gemini provider-specific compatibility issue below.

  8. Critical checkpoint: Special condition checks
    Conclusion: Input-size validation added for multimodal requests is appropriate and clear.

  9. Critical checkpoint: Test coverage
    Conclusion: BE unit tests were expanded well for batch parsing/splitting and multimodal request shapes. But there is no test proving Gemini works end-to-end against the configured endpoint shape after this request-format change.

  10. Critical checkpoint: Test result files
    Conclusion: Not applicable; no regression .out files changed.

  11. Critical checkpoint: Observability
    Conclusion: No new observability additions were needed for this scope.

  12. Critical checkpoint: Transaction / persistence
    Conclusion: Not applicable.

  13. Critical checkpoint: Data writes / atomicity
    Conclusion: Not applicable.

  14. Critical checkpoint: FE-BE variable passing
    Conclusion: The new embed_max_batch_size variable is forwarded to BE correctly.

  15. Critical checkpoint: Performance
    Conclusion: The batching approach should significantly improve throughput, but the Gemini compatibility regression blocks safe acceptance.

  16. Critical checkpoint: Other issues
    Conclusion: No other blocking issue confirmed beyond the inline finding.

Because of the Gemini compatibility regression, I’m requesting changes.

Comment thread be/src/exprs/function/ai/ai_adapter.h
@linrrzqqq
Contributor Author

run buildall

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 100.00% (2/2) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (2/2) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 85.81% (399/465) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.12% (20190/38007)
Line Coverage 36.69% (190245/518463)
Region Coverage 32.96% (147815/448441)
Branch Coverage 34.10% (64737/189857)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 78.32% (336/429) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.99% (26792/37216)
Line Coverage 55.03% (284392/516825)
Region Coverage 51.99% (235258/452532)
Branch Coverage 53.47% (101805/190412)

@linrrzqqq
Contributor Author

/review

Contributor

@github-actions bot left a comment

I found two blocking regressions in the new batching work.

  1. append_parsed_text_result() now treats any scalar text wrapped in [] as a batch array and errors out if it is not valid JSON. That breaks legitimate scalar outputs for existing functions, for example ai_mask explicitly asks the model to emit [MSKED], and ai_agg still routes arbitrary scalar text through adapter->parse_response().
  2. Gemini embed batching now always builds a batchEmbedContents-style request body (requests array), but normalize_endpoint() only rewrites bare .../v1 / .../v1beta roots. Existing AI resources configured with the documented explicit ...:embedContent endpoint will now send the wrong payload to the old method and fail.
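The first regression stems from a shape-based heuristic. A simplified illustration (hypothetical helper, not the PR's parser): any response wrapped in brackets is treated as a batch array, so a legitimate scalar output like "[MSKED]", which is not valid JSON, would be rejected by a parser gated this way:

```cpp
#include <cassert>
#include <string>

// Simplified illustration of the heuristic the review objects to: treating
// any "[...]"-wrapped response text as a batched JSON array. The gate fires
// on legitimate scalar outputs such as "[MSKED]" from ai_mask, which then
// fail downstream JSON parsing.
bool looks_like_batch_array(const std::string& text) {
    return text.size() >= 2 && text.front() == '[' && text.back() == ']';
}
```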

Critical checkpoints:

  • Goal: batch AI requests for better performance. The PR largely achieves that for the happy path, but the two regressions above break existing valid inputs/configurations.
  • Scope/focus: mostly focused, though the shared response parser changed behavior for non-batched paths too (ai_agg).
  • Concurrency: no new shared-state concurrency issue found in the reviewed paths.
  • Lifecycle/static init: no new lifecycle or SIOF issue found.
  • Config: embed_max_batch_size is added and forwarded to BE correctly, but I did not see FE-side range validation.
  • Compatibility: Gemini embed endpoint compatibility is regressed for users with explicit :embedContent resources.
  • Parallel paths: the shared parser change also affects non-batched callers like ai_agg.
  • Tests: the new BE tests cover happy-path batching well, but they miss bracketed scalar outputs and explicit-Gemini-endpoint compatibility.

Comment thread be/src/exprs/function/ai/ai_adapter.h
Comment thread be/src/exprs/function/ai/ai_functions.h
// provider response -> choices[0].message.content = "[\"1\",\"0\",\"1\"]"
// this helper -> appends "1", "0", "1" into `results`
static Status append_parsed_text_result(std::string_view text,
                                        std::vector<std::string>& results) {
Contributor

Consider using std::vector<std::string_view> here to reduce string copies.

Contributor Author

@linrrzqqq commented Apr 16, 2026

There is a lifetime issue; at least one copy is required.
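The lifetime concern (views would dangle once the response buffer is freed, so at least one copy into owning strings is needed) can be sketched as follows. Names are illustrative, not the Doris code:

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Why string_view results are risky here: views point into the parsed
// response buffer and dangle once that buffer is destroyed. Copying into
// owning std::string while the buffer is alive makes the results safe to
// return (illustration only).
std::vector<std::string> copy_out_results(std::string_view response_body) {
    // Imagine these views were produced by parsing `response_body`.
    std::vector<std::string_view> views = {response_body.substr(0, 1)};
    std::vector<std::string> owned;
    owned.reserve(views.size());
    for (std::string_view v : views) {
        owned.emplace_back(v); // copy while the backing buffer is alive
    }
    return owned; // safe: independent of the response buffer's lifetime
}
```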

Comment thread be/src/exprs/function/ai/ai_adapter.h Outdated
Comment thread be/src/exprs/function/ai/ai_functions.h Outdated
@linrrzqqq
Contributor Author

/review

Contributor

@github-actions bot left a comment

Blocking issues found in the current implementation.

Findings

  • be/src/exprs/function/ai/ai_functions.h: _init_from_resource() now uses DORIS_CHECK for missing AI metadata / unknown resource names / unsupported providers, which turns ordinary query errors into BE process aborts instead of returning Status.
  • be/src/exprs/function/ai/ai_adapter.h: append_parsed_text_result() rejects valid pretty-printed JSON arrays, so batched scalar functions can fail on valid model output.
  • be/src/exprs/function/ai/ai_adapter.h + be/src/exprs/function/ai/embed.h: Gemini multimodal batching loses the original content_type and rewrites many valid files to the wrong MIME type.
  • fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java: the new session variables are forwarded without FE-side validation, while BE silently coerces invalid values to different defaults.
  • be/src/exprs/function/ai/ai_functions.h vs be/src/exprs/aggregate/aggregate_function_ai_agg.h: endpoint normalization was added only to scalar AI functions; ai_agg still uses raw endpoints, so parallel AI paths now behave differently for the same resource.

Critical Checkpoints

  • Goal / proof: The PR goal is to improve AI-function throughput by batching requests and adding session controls. The happy-path UT coverage is decent for batch sizing and endpoint normalization, but it does not cover the blocking regressions above.
  • Scope / focus: The refactor is localized, but it changes request/response contracts, endpoint handling, and error semantics, so the effective behavioral scope is broader than the diff suggests.
  • Concurrency: No new lock-order or concurrent mutation issue stood out in the touched paths.
  • Lifecycle / static initialization: No new static-init issue found. ai_agg still depends on externally setting its static QueryContext*, which remains fragile but unchanged.
  • Configuration: New query/session knobs were added, but invalid values are accepted on the FE side and silently rewritten on the BE side, so session state can diverge from real execution behavior.
  • Compatibility / parallel paths: Scalar AI functions now normalize Gemini/base-version and legacy completions endpoints, but ai_agg does not, so equivalent resources no longer behave consistently across parallel AI code paths.
  • Special conditions / parsing: Batch result parsing is too strict for valid JSON formatting, and _init_from_resource() now crashes on recoverable user/config errors.
  • Test coverage: Missing coverage for whitespace-formatted batch JSON, invalid session-variable values, true MIME passthrough for multimodal Gemini inputs, and ai_agg endpoint-parity cases.
  • Observability: Existing error surfaces are probably sufficient once the crash regression is fixed; no separate observability blocker found.
  • Transaction / persistence / data-write correctness: Not applicable for this PR.
  • FE-BE protocol: The new thrift field is added and forwarded, but FE/BE validation semantics are inconsistent for invalid values.
  • Performance: The batching direction looks good, but the correctness and compatibility blockers above should be fixed before it is safe to merge.

std::shared_ptr<AIAdapter>& adapter) const {
    const ColumnWithTypeAndName& resource_column = block.get_by_position(arguments[0]);
    StringRef resource_name_ref = resource_column.column->get_data_at(0);
    std::string resource_name = std::string(resource_name_ref.data, resource_name_ref.size);
Contributor

Using DORIS_CHECK here turns ordinary query/config errors into BE crashes. Before this patch _init_from_resource() returned Status for missing AI metadata, unknown resource names, or unsupported providers; after this change SELECT ai_sentiment('not_exist_resource', ...) aborts the process (the new UT even expects ASSERT_DEATH). These are not invariants, so this path should keep returning Status instead of asserting.
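The Status-vs-crash distinction in this comment can be sketched with hypothetical types (Doris has its own Status class; this is not its API):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Contrast sketch (hypothetical types): a recoverable lookup failure, such
// as an unknown AI resource name in a query, should surface as a Status the
// query layer reports to the client, not a process-fatal check that aborts
// the whole BE.
struct Status {
    bool ok;
    std::string msg;
    static Status OK() { return {true, ""}; }
    static Status Error(std::string m) { return {false, std::move(m)}; }
};

// Returning Status: SELECT with an unknown resource fails only that query.
Status init_from_resource(bool resource_found) {
    if (!resource_found) {
        return Status::Error("unknown AI resource");
    }
    return Status::OK();
}
```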

Contributor Author

FE will check it first; if BE cannot find the ai_resource, it indicates a wrong or missing transmission.

Comment thread be/src/exprs/function/ai/ai_adapter.h
Comment thread be/src/exprs/function/ai/ai_adapter.h
Comment thread fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java
Comment thread be/src/exprs/function/ai/ai_functions.h
@linrrzqqq
Contributor Author

/review

HappenLee
HappenLee previously approved these changes Apr 16, 2026
Contributor

@HappenLee left a comment

LGTM

@github-actions bot added the `approved` (Indicates a PR has been approved by one committer) label Apr 16, 2026
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@linrrzqqq
Contributor Author

run buildall

Contributor

@github-actions bot left a comment

I found blocking issues in the new batching/window paths.

Findings:

  • be/src/exprs/function/ai/ai_adapter.h: build_embedding_request() was only expanded in derived adapters. The base virtual still takes 2 arguments, so the new overrides are invalid and existing 2-argument call sites stop compiling.
  • be/src/exprs/aggregate/aggregate_function_ai_agg.h: ai_context_window_size now makes ai_agg fail on small legal settings because handle_overflow() re-checks the same byte limit after replacing the context with the AI summary. The new unit test with value 8 expects an impossible success case.
  • be/src/exprs/function/ai/ai_functions.h: the new byte-window estimator ignores JSON escaping, so escaping-heavy prompts can still serialize above ai_context_window_size.
  • be/src/exprs/function/ai/embed.h: EMBED batching has the same raw-size accounting problem and can exceed the configured window after JSON encoding.
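The first finding is the classic mismatched-virtual pitfall. A minimal illustration with hypothetical names (not the actual AIAdapter interface): a signature changed only in derived classes does not override the base virtual, and marking the intended override with `override` turns the mismatch into a compile error instead of a silent new function:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Minimal illustration (hypothetical names, not the Doris AIAdapter API).
struct AdapterBase {
    virtual ~AdapterBase() = default;
    virtual std::string build_embedding_request(
            const std::vector<std::string>& inputs) const = 0;
};

struct BatchedAdapter : AdapterBase {
    // If this took an extra argument the base virtual lacks, `override`
    // would fail to compile instead of silently declaring a new, unrelated
    // function that leaves the base pure virtual unimplemented.
    std::string build_embedding_request(
            const std::vector<std::string>& inputs) const override {
        return "batched:" + std::to_string(inputs.size());
    }
};
```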

Critical checkpoint conclusions:

  • Goal / correctness: The PR tries to batch AI requests and expose embed_max_batch_size / ai_context_window_size. The current code does not fully achieve that goal because the embedding adapter interface is internally inconsistent, the ai_agg window handling regresses on low legal values, and the new byte window is not enforced against the actual serialized request size. The current tests do not prove these paths correct.
  • Scope / focus: The changes stay focused on batching and session-variable plumbing, but the adapter interface change is incomplete.
  • Concurrency: No new thread-safety or lock-order issue was evident in the touched code.
  • Lifecycle / static init: No special lifecycle or static-initialization issue was evident in the reviewed paths.
  • Configuration: The new session variables are added and forwarded FE->BE, but the BE implementations do not yet honor the configured byte window correctly in all touched paths.
  • Compatibility: I did not find a storage-format or FE/BE protocol compatibility problem in the changed code, but the new C++ virtual interface contract in AIAdapter is currently broken.
  • Parallel code paths: Scalar AI batching, EMBED, and AI_AGG were all updated, but the byte-window logic is inconsistent or insufficient across them.
  • Conditional checks: The new overflow and window checks are not sufficient as implemented; see inline comments.
  • Test coverage: Coverage improved for many happy paths and invalid session-variable values, but it still misses escaping-heavy batches, and one new AI_AGG test asserts behavior the implementation cannot provide.
  • Test result changes: The modified ai_context_window_size_session_variable_test expectation is not correct for the current logic.
  • Observability: No new observability blocker stood out.
  • Transaction / persistence / data-write correctness: Not applicable for this PR.
  • FE-BE variable passing: The new variables are passed through thrift correctly.
  • Performance: The batching direction is good, but incorrect size accounting can still push failures to the provider instead of batching predictably.
  • Other: No additional blocker beyond the items above.

Please address the inline comments before approval.

Comment thread be/src/exprs/function/ai/ai_adapter.h
Comment thread be/src/exprs/function/ai/ai_functions.h
Comment thread be/src/exprs/function/ai/embed.h
Comment thread be/src/exprs/aggregate/aggregate_function_ai_agg.h
Comment thread be/test/ai/aggregate_function_ai_agg_test.cpp
@github-actions bot removed the `approved` (Indicates a PR has been approved by one committer) label Apr 16, 2026
@linrrzqqq
Contributor Author

run buildall

@linrrzqqq
Contributor Author

/review

Contributor

@github-actions bot left a comment

The PR’s goal is clear: batch AI scalar functions / embed requests for better performance and expose session variables to control batch sizing. The implementation is directionally good, but I found several blocking correctness issues before this can be merged.

Findings:

  1. Scalar AI functions now globally reinterpret any response text that looks like a JSON array as batched output, which regresses legitimate literal outputs such as [...] for single-row ai_generate / ai_summarize / ai_translate / etc.
  2. The new ai_context_window_size batching logic for scalar AI functions underestimates serialized prompt size because it ignores JSON escaping, so the configured byte window is not enforced conservatively.
  3. ai_agg now fails on the first oversized row by trying to execute the task with empty accumulated data.
  4. ai_agg still reads query-specific state from a static QueryContext*, so concurrent queries can bleed ai_context_window_size, timeout, and AI-resource state into each other.
  5. The added regression coverage validates SET constraints, but it never runs an AI query with non-default embed_max_batch_size / ai_context_window_size, so FE->thrift->BE propagation is still unproven.

Critical checkpoint conclusions:

  • Goal of the current task: Partially accomplished. Batching/session controls were added, but the issues above mean the new behavior is not yet correct in all supported cases.
  • Is the modification small, clear, and focused: Mostly focused on AI batching/session variables, but the generic adapter/parser refactor changed behavior for all scalar AI functions.
  • Concurrency: Not safe for ai_agg. Query-specific state still comes from a static pointer, so concurrent queries can interfere.
  • Lifecycle / static state: No new cross-TU static-init issue found, but ai_agg’s static QueryContext* lifecycle remains unsafe.
  • Configuration items: embed_max_batch_size and ai_context_window_size were added and forwarded, but current tests do not prove end-to-end propagation, and the scalar batching estimate does not conservatively honor the configured window.
  • Incompatible changes: The new thrift fields are additive, but scalar AI response parsing is now backward-incompatible for literal JSON-array strings.
  • Parallel code paths: embed, scalar AI functions, and ai_agg now each batch independently; oversized-input handling is inconsistent and ai_agg is incorrect on the first oversized row.
  • Special conditional checks: The new [...] auto-unpack condition is too broad because it runs in the generic adapter layer without batch-context gating.
  • Test coverage: Improved, but still missing coverage for literal JSON-array scalar outputs, escaped-size batching, first oversized ai_agg rows, concurrent ai_agg queries, and non-default end-to-end session-variable behavior.
  • Has the test result been added/modified: No .out updates were needed here; the new regression additions currently only validate setters, not runtime effect.
  • Observability: A scalar batch-size mismatch warning was added, but there is still no specific observability for ai_agg overflow/cross-query issues.
  • Transaction / persistence / data writes: Not applicable for this PR.
  • FE/BE variable passing: The new variables were added to thrift and consumed on BE paths, but end-to-end tests do not yet demonstrate that query execution actually sees the non-default values.
  • Performance: Batching should help, but the escaped-size undercount can still produce oversized requests, and the ai_agg static-context bug can corrupt behavior under load.
  • Other issues: None beyond the findings above.

Because these are correctness issues rather than tolerable nits, I’m requesting changes.

Comment thread be/src/exprs/function/ai/ai_adapter.h
Comment thread be/src/exprs/function/ai/ai_functions.h
Comment thread be/src/exprs/aggregate/aggregate_function_ai_agg.h Outdated
Comment thread be/src/exprs/aggregate/aggregate_function_ai_agg.h
Comment thread regression-test/suites/ai_p0/test_ai_functions.groovy
@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/445) 🎉
Increment coverage report
Complete coverage report

@linrrzqqq
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.48% (1798/2291)
Line Coverage 64.14% (32291/50345)
Region Coverage 65.10% (16254/24967)
Branch Coverage 55.60% (8684/15620)

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 92.47% (479/518) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.30% (20316/38118)
Line Coverage 36.79% (191403/520224)
Region Coverage 33.11% (148882/449697)
Branch Coverage 34.18% (65048/190294)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (8/8) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 85.24% (410/481) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.85% (27565/37326)
Line Coverage 57.58% (298606/518587)
Region Coverage 54.70% (248205/453788)
Branch Coverage 56.34% (107530/190851)

@linrrzqqq
Contributor Author

run cloud_p0

@linrrzqqq
Contributor Author

run external

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 85.24% (410/481) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.85% (27565/37326)
Line Coverage 57.58% (298606/518587)
Region Coverage 54.70% (248205/453788)
Branch Coverage 56.34% (107530/190851)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 85.24% (410/481) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.84% (27562/37326)
Line Coverage 57.57% (298536/518587)
Region Coverage 54.72% (248311/453788)
Branch Coverage 56.33% (107513/190851)

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 100.00% (8/8) 🎉
Increment coverage report
Complete coverage report

